EUDocLib

EUDocLib website URL: http://eudoclib.atlasproject.eu

EUDocLib is a publicly accessible repository of EU law documents from the EUR-Lex collection. This web site (online service) is meant to provide enhanced navigation and easier access to relevant documents in the user's language. The EUR-Lex documents are processed and linked to their original URLs.

The EUDocLib online service is implemented with the Atlas i-Publisher software. A set of EU legal documents from the EUR-Lex collection is compiled and imported into the service database. The service is currently in Beta.

This service lets you retrieve information, navigate easily through content, search, and find similar documents from the EUR-Lex collection in an intelligent way. It currently works in English; Bulgarian, German, Greek, Polish and Romanian will follow later.

The service automatically compiles a summary of each document, featuring all relevant information about it: common words and phrases, capitalized phrases, URLs, similar documents, an extractive summary, etc.
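To make the kind of information in such a summary concrete, here is a minimal Python sketch that pulls common words, capitalized phrases and URLs out of plain text. It is only an illustration under simple assumptions: the function name `extract_features`, the tiny stop-word list and the regular expressions are hypothetical and are not taken from the Atlas language processing chains.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real service would use a per-language list.
STOP_WORDS = {"the", "of", "and", "to", "in", "a", "for", "on", "is", "by"}

# Simplified URL pattern: runs until whitespace or a closing bracket.
URL_RE = re.compile(r"https?://[^\s)>\]]+")
# Two or more consecutive capitalized words, e.g. "European Parliament".
CAP_PHRASE_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")
WORD_RE = re.compile(r"[A-Za-z]+")


def extract_features(text, top_n=5):
    """Return common words, capitalized phrases and URLs found in text."""
    urls = URL_RE.findall(text)
    # Remove URLs first so their parts are not counted as ordinary words.
    text_without_urls = URL_RE.sub(" ", text)
    words = [w.lower() for w in WORD_RE.findall(text_without_urls)]
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return {
        "common_words": [w for w, _ in counts.most_common(top_n)],
        "capitalized_phrases": sorted(set(CAP_PHRASE_RE.findall(text_without_urls))),
        "urls": urls,
    }
```

Calling `extract_features(document_text)` returns a dictionary with the three lists. Similar documents and the extractive summary are left out of this sketch because they need information from more than one document or sentence ranking.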

Every document is catalogued in the major EUR-Lex classification systems: eurovocTree, eurovocdescriptor and eurovocDirectoryCodeTree. In addition, our service “automagically” suggests topics the document most likely belongs to. The classification is based on the information extracted from the document.

After a document is processed, it is indexed by a full-text search engine based on Lucene. Using a simple, Google-like search form, you can quickly find words or phrases in all documents. The search results show the excerpts from the text that best match the search terms.
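The idea behind full-text search with excerpts can be sketched with a toy in-memory inverted index. This is not Lucene and not the service's actual code; the helper names `build_index`, `excerpt` and `search` are assumptions made for the illustration, and the sketch only shows the principle of mapping terms to documents and returning a text fragment around the match.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"\w+")


def build_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in TOKEN_RE.findall(text.lower()):
            index[token].add(doc_id)
    return index


def excerpt(text, term, width=30):
    """Return a fragment of text around the first occurrence of term."""
    match = re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
    if match is None:
        return ""
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(text))
    return text[start:end]


def search(index, docs, query):
    """Return (doc_id, excerpt) pairs for documents containing every query term."""
    terms = TOKEN_RE.findall(query.lower())
    if not terms:
        return []
    # A document is a hit only if it contains all of the query terms.
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    # The excerpt is built around the first query term only.
    return [(doc_id, excerpt(docs[doc_id], terms[0])) for doc_id in sorted(hits)]
```

Here `docs` is assumed to be a dictionary from document id to plain text. A real engine such as Lucene adds relevance ranking, stemming and highlighting on top of this basic structure.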

The service will guide you through all publications by finding similar documents in the supported languages. The similarity between two documents is calculated using the extracted essence of the documents.
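The exact similarity measure used by the service is not described here. As one common possibility, the sketch below assumes that the "essence" of a document is a list of extracted terms and compares two documents with cosine similarity over term counts. Both that assumption and the function names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two lists of terms, in the range 0.0 to 1.0."""
    a, b = Counter(terms_a), Counter(terms_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty essence is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(doc_id, essences, top_n=3):
    """Rank the other documents by similarity to doc_id, best first."""
    scores = [
        (other, cosine_similarity(essences[doc_id], essences[other]))
        for other in essences
        if other != doc_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]
```

With `essences` mapping document ids to term lists, `most_similar("doc1", essences)` returns the closest documents and their scores.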

At the time of writing this chapter (12.2011) the service is available only in English, because the language processing chains are still undergoing final tests. Once they are finalized, all project languages and their corresponding language processing chains will be implemented and added to the service.

The EUDocLib web site is built using Atlas i-Publisher in a separate domain where the specific content model is created.

Content model

The content model comprises a single content type called EUR-Lex document, with the following attributes:

  • Document Date: date: The date when the document is issued
  • Document Title: text: Document title as in EURLex database
  • Reference: text: References such as pages, languages, etc
  • CELEX Number: text: Document CELEX number (the unique identifier of each document in EUR-Lex – check http://eur-lex.europa.eu/en/tools/faq.htm#1.12)
  • CELEX based Url: text: Link to the document web page on EUR-Lex website based on CELEX
  • Document text URL: text: Link to the document text on EUR-Lex website
  • Author: text: Author of the document
  • Form: text: Form of the document
  • Procedure: text: Text with described procedure for the document
  • Document text: file: Attached File attribute used for language processing and analysis
  • Bibliography text: file: Text file with bibliography of the document
  • Available in: attr. list: Available Languages for the document text

This content type maps the basic information of a single EUR-Lex document.
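For readers who think in code, the attribute list above can be mirrored as a plain Python data class. This is only a sketch for orientation: i-Publisher content types are defined in the software itself, and the class name, the field names, the choice of required fields and the use of file paths for the file attributes are all assumptions made here.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class EurLexDocument:
    """Hypothetical in-code mirror of the EUR-Lex document content type."""

    document_date: date               # Document Date (date)
    document_title: str               # Document Title (text)
    celex_number: str                 # CELEX Number (text), unique per document
    reference: str = ""               # Reference (text): pages, languages, etc.
    celex_based_url: str = ""         # CELEX based Url (text)
    document_text_url: str = ""       # Document text URL (text)
    author: str = ""                  # Author (text)
    form: str = ""                    # Form (text)
    procedure: str = ""               # Procedure (text)
    document_text_file: str = ""      # Document text (file), assumed stored as a path
    bibliography_text_file: str = ""  # Bibliography text (file), assumed stored as a path
    available_in: List[str] = field(default_factory=list)  # Available in (attr. list)
```

An instance is created with the three assumed required values, for example `EurLexDocument(date(2011, 12, 1), "Example title", "EXAMPLE-0001")`, where the title and CELEX number are placeholders rather than real EUR-Lex data.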

 

Website

Functionality

The web site has the following functionality:

  1. On the Home page, the user is presented with a search field for entering a search term and receives a list of documents containing that term. Details for each document in the list are displayed, such as Categorization, Named entities and Similar Documents (and an automatically generated Summary of the document, when this functionality is available).

  2. On the Browse page, the two basic EUR-Lex categorization trees, EurovocTree and EurovocDirectory, are presented to the user and can be browsed for documents. Again, the details of each document are displayed as in the previous case.

  3. Search index. The text content of all documents is indexed in all available project languages. This allows the visitor to run full-text searches.

Pages

In terms of the i-Publisher software, the EUDocLib web site consists of the following pages:

Master page
Master page for EUDocLib pages from the inner navigation

Sets the layout of the EUDocLib web site pages (header, navigation, main content area, footer) as well as the common graphical components such as logos, navigation and backgrounds.

Simple search
Home page of the web site

Search results
List of documents from a search result

A List widget is added to the Main content area. It is configured to display the list of documents returned from a performed search, along with the fragment of text where the searched phrase is found.

Document Details 
Single document details and text analysis components

Browse
Categorization trees that can be expanded and browsed for documents

About 
Overview of the service

Advanced search
n/a

Contacts and Support
Contact and support information

Copyright notice
EUR-Lex Copyright notice for document usage

Terms of Service
Terms for using this service